Sequence-to-Sequence Emotional Voice Conversion With Strength Control
Similar resources
Sequence-to-Sequence Voice Conversion with Similarity Metric Learned Using Generative Adversarial Networks
We propose a training framework for sequence-to-sequence voice conversion (SVC). A well-known problem regarding a conventional VC framework is that acoustic-feature sequences generated from a converter tend to be over-smoothed, resulting in buzzy-sounding speech. This is because a particular form of similarity metric or distribution for parameter training of the acoustic model is assumed so tha...
Statistical sequence-to-frame mapping techniques for voice conversion
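The goal of voice conversion is to convert the voice of one speaker into that of another speaker. This can be viewed as finding a mapping function between the speech time series of the two speakers. Statistical mapping methods based on GMMs [1], [2] are widely used for voice conversion. However, because GMM-based conversion uses a frame-to-frame mapping function, it does not make full use of the context information in the speech time series. The HMM is an effective model of speech time series and is widely used in speech recognition and speech synthesis. This work studies HMM-based voice conversion. We derive an HMM-based regression, a sequence-to-frame conversion function. Previous HMM-based voice conversion methods [3]-[5] segment the speech by forced alignment and convert each segment separately. Unlike those methods, we...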
Voice Conversion Using Sequence-to-Sequence Learning of Context Posterior Probabilities
Voice conversion (VC) using sequence-to-sequence learning of context posterior probabilities is proposed. Conventional VC using shared context posterior probabilities predicts target speech parameters from the context posterior probabilities estimated from the source speech parameters. Although conventional VC can be built from non-parallel data, it is difficult to convert speaker individuality...
Multitask Sequence-to-Sequence Models for Grapheme-to-Phoneme Conversion
Recently, neural sequence-to-sequence (Seq2Seq) models have been applied to the problem of grapheme-to-phoneme (G2P) conversion. These models offer a straightforward way of modeling the conversion by jointly learning the alignment and translation of input to output tokens in an end-to-end fashion. However, until now this approach did not show improved error rates on its own compared to traditio...
GMM-based voice conversion applied to emotional speech synthesis
Voice conversion method is applied to synthesizing emotional speech from standard reading (neutral) speech. Pairs of neutral speech and emotional speech are used for conversion rule training. The conversion adopts GMM (Gaussian Mixture Model) with DFW (Dynamic Frequency Warping). We also adopt STRAIGHT, the high-quality speech analysis-synthesis algorithm. As conversion target emotions, (Hot) a...
Journal
Journal title: IEEE Access
Year: 2021
ISSN: 2169-3536
DOI: 10.1109/access.2021.3065460